- 
                Notifications
    You must be signed in to change notification settings 
- Fork 2.1k
[v1.13.0] Use mincore(2) to create diff snapshots without dirty page tracking #5274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v1.13.0] Use mincore(2) to create diff snapshots without dirty page tracking #5274
Conversation
0622764    to
    7379503      
    Compare
  
    | Codecov ReportAttention: Patch coverage is  
 
 Additional details and impacted files@@            Coverage Diff             @@
##             main    #5274      +/-   ##
==========================================
+ Coverage   82.92%   83.00%   +0.08%     
==========================================
  Files         250      250              
  Lines       26844    26879      +35     
==========================================
+ Hits        22260    22312      +52     
+ Misses       4584     4567      -17     
 Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
 | 
65690d4    to
    9429bc6      
    Compare
  
    20e2ca8    to
    2fe1ba9      
    Compare
  
    c5a6f74    to
    fbb2e4d      
    Compare
  
    31e6169    to
    5199aa0      
    Compare
  
    f61fb5b    to
    658de5a      
    Compare
  
    Include a sentence about diff snapshots producing sparse files in our snapshot documentation. Signed-off-by: Patrick Roy <[email protected]>
The rebase-snap tool is deprecated and snapshot-editor should be used. Since the two ways of merging snapshot files are completely equivalent, let's remove the rebase-snap instructions and only keep the snapshot-editor ones, to avoid people using deprecated tools. Signed-off-by: Patrick Roy <[email protected]>
Currently, Firecracker only supports creation of diff snapshots if dirty page tracking is explicitly enabled. Allow creation of diff snapshots even if it is not enabled, through the use of mincore(2). The mincore(2) syscalls determines which pages of a VMA are "in core". For anonymous mappings (as used by booted VMs without vhost-user devices), this refers to all pages that are currently faulted in. For memfd (as used by booted vms with vhost-user devices), this means all pages that have been allocated into the memfd, regardless of whether they were allocated through the VMA on which mincore(2) was called (meaning creation of mincore-diff-snapshots will correctly account for pages that were only touched by the vhost-user backend, but not by Firecracker or KVM). For restored VMs, this means all pages of the underlying snapshot file that have been faulted in. Note that this only works if swap has been disabled, as pages currently swapped to disk do not count as "in-core", yet obviously should be included in a diff snapshot. If swap is used, dirty page tracking MUST be enabled for diff snapshots to work correctly. Compared to diff snapshots based on dirty page tracking, mincore-based diff snapshots will be slightly larger. This is because dirty page tracked diff snapshots only include pages that were actually written to, while mincore-based snapshots will contain all pages that were accessed at all, e.g. even if only for reading. Signed-off-by: Patrick Roy <[email protected]>
We have two different API parameters that do the same thing in two different APIs: For booted VMs, we enable KVM dirty page tracking by setting track_dirty_pages to true in /machine-config. For resumed VMs, we enable KVM dirty page tracking by setting enable_diff_snapshots to true in /snapshot/load. Apart from being inconsistent for no reason, one of these is very badly named: With support for diff snapshots without dirty page tracking, enable_diff_snapshots is a misnomer, because setting it to false will still allow us to do diff snapshots, it'll just fall back to mincore. Rename enable_diff_snapshots to track_dirty_pages, so we're consistent and also so that the parameter reflects what is actually happening. Go through the whole deprecation cycle of deprecating enable_diff_snapshots and adding the new track_dirty_pages parameter at the same time. We will need to remove enable_diff_snapshots in 2.0 Signed-off-by: Patrick Roy <[email protected]>
Document that diff snapshots can work even without dirty page tracking, and what the drawbacks of this are (bigger snapshots, no swap support). Signed-off-by: Patrick Roy <[email protected]>
We were using this property as a proxy for two things: "Does this snapshot rebase rebasing before resumptions?" and "do I need to enable dirty page tracking to create another snapshot of this type?". With mincore-snapshots, these two questions are no longer equivalent (mincore snapshots need rebase, but do _not_ need dirty page tracking), so untangle this property into two properties. Signed-off-by: Patrick Roy <[email protected]>
With support for "mincore diff snapshots", we'll have two snapshot types that map to "diff" in the firecracker API, so the enum can no longer be string based. Instead just have a property that translates to Firecracker API types. Signed-off-by: Patrick Roy <[email protected]>
Just always demand the enum. The internal conversion from str to enum now doesnt work anymore with the enum no longer being str-based. Signed-off-by: Patrick Roy <[email protected]>
Have them use the correct needs_rebase/needs_dirty_tracking helpers instead. Signed-off-by: Patrick Roy <[email protected]>
Now that the test framework correctly differentiates between the need for rebase and the need for dirty page tracking, start running tests with mincore snapshots. Signed-off-by: Patrick Roy <[email protected]>
Just have a single test that is parametrized by snapshot type. This will then automatically also test mincore diff snapshots. Signed-off-by: Patrick Roy <[email protected]>
Fix various minor errors: - Drop some specifics on the cgroups v1 disclaimer, because all supported host kernel versions are "5.4+" - Do not claim that creating a snapshot has no effect on the running VM, because that's not true. - Cut down on some repeated and confusing information / examples near the end. Signed-off-by: Patrick Roy <[email protected]>
Extend the limitation section of the huge pages docs to include a note about how dirty page tracking negates the benefits of huge pages. Also add a cross-reference to the diff snapshot docs. Signed-off-by: Patrick Roy <[email protected]>
At the time of writing the docs, that was temporarily the case in v1.6.0, but was reverted in scenarios where no vhost-user devices are attached due to performance regressions. Signed-off-by: Patrick Roy <[email protected]>
658de5a    to
    4cdcc76      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Currently, Firecracker only supports creation of diff snapshots if dirty
page tracking is explicitly enabled. Allow creation of diff snapshots
even if it is not enabled, through the use of mincore(2). The mincore(2)
syscalls determines which pages of a VMA are "in core". For anonymous
mappings (as used by booted VMs without vhost-user devices), this refers
to all pages that are currently faulted in. For memfd (as used by booted
vms with vhost-user devices), this means all
pages that have been allocated into the memfd, regardless of whether
they were allocated through the VMA on which mincore(2) was called
(meaning creation of mincore-diff-snapshots will correctly account for
pages that were only touched by the vhost-user backend, but not by
Firecracker or KVM). For restored VMs, this means all pages of the
underlying snapshot file that have been faulted in.
Note that this only works if swap has been disabled, as pages currently
swapped to disk do not count as "in-core", yet obviously should be
included in a diff snapshot. If swap is used, dirty page tracking MUST
be enabled for diff snapshots to work correctly.
Compared to diff snapshots based on dirty page tracking, mincore-based
diff snapshots will be slightly larger. This is because dirty page
tracked diff snapshots only include pages that were actually written to,
while mincore-based snapshots will contain all pages that were accessed
at all, e.g. even if only for reading.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.PR Checklist
tools/devtool checkstyleto verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md.Runbook for Firecracker API changes.
integration tests.
TODO.rust-vmm.